Abstract

We study computing the convolution of a private input x with a public
input h, while satisfying the guarantees of (ε, δ)-differential privacy. Convolution
is a fundamental operation, intimately related to Fourier Transforms.
In our setting, the private input may represent a time series of
sensitive events or a histogram of a database of confidential personal infor- 
mation. Convolution then captures important primitives including linear 
filtering, which is an essential tool in time series analysis, and aggregation 
queries on projections of the data. 

We give a nearly optimal algorithm for computing convolutions while 
satisfying (ε, δ)-differential privacy. Surprisingly, we follow the simple
strategy of adding independent Laplacian noise to each Fourier coefficient
and bounding the privacy loss using the composition theorem from [10].
We derive a closed form expression for the optimal noise to add to each 
Fourier coefficient using convex programming duality. Our algorithm is 
very efficient - it is essentially no more computationally expensive than 
a Fast Fourier Transform. To prove near optimality, we use the recent 
discrepancy lower bounds of [23] and derive a spectral lower bound using
a characterization of discrepancy in terms of determinants. 



1 Introduction 

The noise complexity of linear queries is of fundamental interest in the theory 
of differential privacy. Consider a database that represents users (or events) of 
N different types (in the case of events, a type is a time step). We may encode 
the database as a vector x indexed by {1, . . . , N}, where x_i gives the number of
users of type i. A linear query asks for the dot product (a, x); a workload of M 
queries is given as a matrix A, and the intended output is Ax. As the database 
often encodes personal information, we wish to answer queries in a way that 
does not compromise the individuals represented in the data. We adopt the 
now standard notion of (ε, δ)-differential privacy [3]; informally, an algorithm
is differentially private if its output distribution does not change drastically
when a single user/event changes in the database. This definition necessitates
randomization and approximation, and, therefore, the optimal accuracy of any
differentially private algorithm on a workload A becomes a central question.
We measure accuracy by mean squared error: the expected average of the
squared error over all M queries.

The queries in a workload A can have different degrees of correlation, and 
this poses different challenges for the private approximation algorithm. In 
one extreme, when A is a set of Ω(N) independently sampled random {0, 1}
(i.e. counting) queries, we know, by the seminal work of Dinur and Nissim [7],
that any (ε, δ)-differentially private algorithm needs to incur at least Ω(N)
squared error per query on average. On the other hand, if A consists of the
same counting query repeated M times, we only need to add O(1) noise per
query [8]. While those two extremes are well understood - the bounds cited
above are tight - little is known about workloads of queries with some, but not 
perfect, correlation. 

The convolution¹ of the private input x with a public vector h is defined as
the vector y where

$$y_k = \sum_{n=0}^{N-1} x_n h_{(k-n) \bmod N}, \qquad k \in \{0, \ldots, N-1\}.$$

This convolution map is a workload of N linear queries. Each query is a circular 
shift of the previous one, and, therefore, the queries are far from independent 
but not identical either. Convolution is a fundamental operation that arises in 
algebraic computations such as polynomial multiplication. It is a basic operation 
in signal analysis and has a well-known connection to Fourier transforms. Of
primary interest to us, it is a natural primitive in various applications: 

• linear filters in the analysis of time series data can be cast as convolu- 
tions; as example applications, linear filtering can be used to isolate cycle 
components in time series data from spurious variations, and to compute 
time-decayed statistics of the data; 

• when user type in the database is specified by d binary attributes, aggregate
queries such as k-wise marginals and generalizations can be represented
as convolutions.

Privacy concerns arise naturally in these applications: the time series data can 
contain records of sensitive events, such as financial transactions, records of user 
activity, etc.; some of the attributes in a database can be sensitive, for example 
when dealing with databases of medical data. 

We give the first nearly optimal algorithm for computing convolution under
(ε, δ)-differential privacy constraints. Our algorithm gives the lowest mean
squared error achievable by adding independent (but non-uniform) Laplace noise
to the Fourier coefficients of x and bounding the privacy loss by the composition
theorem of Dwork et al. [10]. Using complementary slackness conditions, we derive
a simple closed form for the optimal amount of error that should be added
in the direction of each Fourier coefficient. We prove that, for any fixed h, up to
polylogarithmic factors, any (ε, δ)-differentially private algorithm incurs at least
as much squared error per query as our algorithm. Somewhat surprisingly, our
result shows that the simple strategy of adding independent noise in the Fourier
domain is nearly optimal for computing convolutions. Prior to our work there
were no known nearly instance-optimal² (ε, δ)-differentially private algorithms for a
natural class of linear queries. Additionally, our algorithm is simpler and more
efficient than related algorithms for (ε, 0)-differential privacy.

¹ Here we define circular convolution; however, as discussed in the paper, our results
generalize to other types of convolution, which are defined similarly.

To prove optimality of our algorithm, we use the recent discrepancy-based
noise lower bounds of Muthukrishnan and Nikolov [23]. We use a characterization
of discrepancy in terms of determinants of submatrices discovered by
Lovász, Spencer, and Vesztergombi, together with ideas by Hardt and Talwar,
who give instance-optimal algorithms for the stronger notion of (ε, 0)-differential
privacy³. A main technical ingredient in our proof is a connection between the
discrepancy of a matrix A and the discrepancy of PA where P is an orthogonal
projection operator.

In addition to applications to linear filtering, our algorithm allows us to 
approximate marginal queries encoded by w-DNFs, which generalize k-wise
marginal queries. Using concentration results for the spectrum of bounded-width
DNFs, we derive a non-trivial error bound for approximating w-DNF
queries. The bound is independent of the DNF size. 

Related work. The problem of computing private convolutions has not 
been considered in the literature before. However, there is a fair amount of 
work on the more general problem of computing arbitrary linear queries, as well 
as some work on special cases of convolution maps. 

The problem of computing arbitrary linear maps of a private database histogram
was first considered in the seminal work of Dinur and Nissim [7]. They
showed that privately answering M random 0-1 queries on a universe of size
N requires Ω(N) mean squared error as long as M = Ω(N), and this bound is
tight. These bounds do not directly apply to our work, as a set of independent
random queries is not likely to encode a circular convolution. Nevertheless, one
can show, using spectral noise lower bounds, that a convolution with a random
0-1 vector h requires asymptotically as much error as N random queries. Yet,
many particular convolutions of interest require much less noise. This fact motivates
us to study algorithms for approximating the convolution x * h which
are optimal for any given h. An efficient algorithm with this kind of per-instance
(in terms of h) optimality guarantee obviates the need to develop
specialized algorithms. Next we review some prior work on special instances of
convolution maps and also related work on computing linear maps optimally.

Bolot et al. [3] give algorithms for various decayed sum queries: window
sums, exponentially and polynomially decayed sums. Any decayed sum function
is a type of linear filter, and, therefore, a special case of convolution. Thus,
our current work gives a nearly optimal (ε, δ)-differentially private approximation
for any decayed sum function. Moreover, as far as mean squared error
is concerned, our algorithms give improved error bounds for the window sums
problem: constant squared error per query. However, unlike [3], we only consider
the offline batch-processing setting, as opposed to the online continual
observation setting.

² Note that instance-optimality here refers to the query vector h, while we still consider
worst-case error over the private input x.

³ Note that establishing instance-optimality for (ε, δ)-differential privacy is harder from
the error lower bounds perspective, as the privacy definition is weaker.

The work of Barak et al. [1] on computing k-wise marginals concerns a
restricted class of convolutions (see Section 5). Moreover, Kasiviswanathan et al. [16]
show a noise lower bound for k-wise marginals which is tight in the worst case.
Our work is a generalization: we are able to give nearly optimal approximations
to a wider class of queries, and our lower and upper bounds nearly match for
any convolution.

Li and Miklau [18, 19] proposed the class of extended matrix mechanisms,
building on prior work on the matrix mechanism [17], and showed how to efficiently
compute the optimal mechanism from the class. Furthermore, independently
and concurrently with our work, Cormode et al. [BJ considered adding
optimal non-uniform noise to a fixed transform of the private database. Since
our mechanism is a special instance of the extended matrix mechanism, the algorithms
of Li and Miklau have at most as much error as our algorithm. However,
similarly to [BJ, we gain significantly in efficiency by fixing a specific transform
(in our case the Fourier transform) of the data and computing a closed form
expression for the optimal noise magnitudes. Unlike the work of Li and Miklau
and Cormode et al., we are able to show nearly tight lower bounds for any
differentially private algorithm (not just the extended matrix mechanism) and
any set of convolution queries. Therefore, we can show that the choice of the
Fourier transform comes without loss of generality for any set of convolution
queries.

In the setting of (ε, 0)-differential privacy, Hardt and Talwar [15] prove nearly
optimal upper and lower bounds on approximating Ax for any matrix A. Recently,
their results were improved, and made unconditional, by Bhaskara et
al. [2]. Prior to our work a similar result was not known for the weaker notion
of approximate privacy, i.e. (ε, δ)-differential privacy. Subsequently to our work,
our results were generalized by Nikolov, Talwar, and Zhang [24] to give nearly
optimal algorithms for computing any linear map A under (ε, δ)-differential privacy.
Their work combined our use of hereditary discrepancy bounds on error
through the determinant lower bound with results from asymptotic convex geometry.
The algorithms from [2, 15] are computationally expensive, as they
need to sample from a high-dimensional convex body⁴. Even the more efficient
algorithm from [24] has running time Ω(N³), as it needs to approximate the
minimum enclosing ellipsoid of an N-dimensional convex body. By contrast, our
algorithm's running time is dominated by the running time of the Fast Fourier
Transform, i.e. O(N log N), making it more suitable for practical applications.
Also, for some sets of queries, such as running sums, our analysis gives tighter
bounds than the analysis of the algorithm in [21].

⁴ One of the best known algorithms is due to Lovász and Vempala [21] and, ignoring other
parameters, makes Θ(N³) calls to a separation oracle, each of which would require solving a
linear programming feasibility problem.

A related line of work seeks to exploit sparsity assumptions on the private
database in order to reduce error; as we do not limit the database size, our results
are not directly comparable. Using our histogram representation, database size
corresponds to the norm ‖x‖₁ where x is the database in histogram representation.
For general linear queries, the multiplicative weights algorithm of Hardt
and Rothblum achieves mean squared error O(n√(log N)) for ‖x‖₁ ≤ n. This
bound is nearly tight for random queries, but can be loose for special queries
of interest. For example, running sums require noise polylogarithmic in N, which is
less than n except for n very small in the universe size. In general, algorithms
which bound database size in order to bound error become less useful when 
database size is large compared to the total number of queries, and for very 
large databases algorithms such as ours are still of interest. This is true also for 
the line of algorithms for marginal queries which give error an arbitrary small 
constant fraction of the database size [S1H31II11I2S]. Note further that the optimal
error for a subset of all marginal queries may be less than linear in database
size, and our algorithms will give near optimal error for the specific subset of 
interest. 

Organization. We begin with preliminaries on differential privacy and 
convolution operators. In Section 3 we derive our main lower bound result, and
in Section 4 we describe and analyze our nearly optimal algorithm. In Section 5
we describe applications of our main results.

2 Preliminaries 

Notation: ℕ, ℝ, and ℂ are the sets of non-negative integers, real, and complex
numbers respectively. By log we denote the logarithm in base 2 while by ln
we denote the logarithm in base e. Matrices and vectors are represented by
boldface upper and lower cases, respectively. $A^T$, $A^*$, $A^H$ stand for the transpose,
the conjugate and the transpose conjugate of A, respectively. The trace
and the determinant of A are respectively denoted by tr(A) and det(A). $A_{m:}$
denotes the m-th row of matrix A, and $A_{:n}$ its n-th column. $A|_S$, where A is a
matrix with N columns and $S \subseteq [N]$, denotes the submatrix of A consisting of
those columns corresponding to elements of S. $\lambda_A(1), \ldots, \lambda_A(n)$ represent the
eigenvalues of an n x n matrix A. $I_N$ is the identity matrix of size N. E[·] is
the statistical expectation operator. Lap(x, s) denotes the Laplace distribution
centered at x with scale s, i.e. the distribution of the random variable x + η
where η has probability density function p(y) ∝ exp(−|y|/s).

2.1 Convolution 

In this section, we first give the definition of circular convolution. We then recall
important results on the Fourier eigen-decomposition of convolution. Generalization
to other notions of convolution and applications are discussed in
Section 5.

Let $x = \{x_0, \ldots, x_{N-1}\}$ be a real input sequence of length N, and
$h = \{h_0, \ldots, h_{N-1}\}$ a sequence of length N. The circular convolution of x and h is
the sequence y = x * h of length N defined by

$$y_k = \sum_{n=0}^{N-1} x_n h_{(k-n) \bmod N}, \qquad \forall k \in \{0, \ldots, N-1\}. \tag{1}$$



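Definition 1. The N x N circular convolution matrix H is defined as

$$H = \begin{pmatrix}
h_0 & h_{N-1} & h_{N-2} & \cdots & h_1 \\
h_1 & h_0 & h_{N-1} & \cdots & h_2 \\
h_2 & h_1 & h_0 & \cdots & h_3 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
h_{N-1} & h_{N-2} & h_{N-3} & \cdots & h_0
\end{pmatrix}_{N \times N}.$$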



This matrix is a circulant matrix with first column $h = [h_0, \ldots, h_{N-1}]^T \in \mathbb{R}^N$,
and its subsequent columns are successive cyclic shifts of its first column. Note
that H is a normal matrix ($HH^H = H^HH$).

Define the column vectors $x = [x_0, \ldots, x_{N-1}]^T \in \mathbb{R}^N$ and
$y = [y_0, \ldots, y_{N-1}]^T \in \mathbb{R}^N$. The circular convolution (1) can be written in matrix notation y = Hx.
In Section 2.2 we recall that circular convolution can be diagonalized in the
Fourier basis.
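
As a concrete check of the definitions above, the following Python sketch evaluates equation (1) directly and again through the circulant matrix H. The function names are illustrative and do not come from the paper; the indexing H[k][n] = h[(k − n) mod N] is the convention implied by "first column h, subsequent columns are cyclic shifts".

```python
def circular_convolution(x, h):
    """Return y with y[k] = sum_n x[n] * h[(k - n) mod N], as in equation (1)."""
    n_len = len(x)
    if len(h) != n_len:
        raise ValueError("x and h must have the same length")
    return [sum(x[n] * h[(k - n) % n_len] for n in range(n_len))
            for k in range(n_len)]


def circulant_matrix(h):
    """Build H with H[k][n] = h[(k - n) mod N]; column 0 is h itself."""
    n_len = len(h)
    return [[h[(k - n) % n_len] for n in range(n_len)] for k in range(n_len)]


def mat_vec(matrix, vector):
    """Plain matrix-vector product for lists of rows."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


x = [1, 2, 3, 4]
h = [1, 0, 0, 1]
y_direct = circular_convolution(x, h)
y_matrix = mat_vec(circulant_matrix(h), x)
```

Both routes give the same vector, which is the statement y = Hx for this small instance.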



2.2 Fourier Eigen-decomposition of Convolution 

In this section, we recall the definition of the Fourier basis, and the eigen- 
decomposition of circular convolution in this basis. 

Definition 2. The normalized Discrete Fourier Transform (DFT) matrix of
size N is defined as

$$F_N = \frac{1}{\sqrt{N}} \left[ \exp\left( -\frac{j 2\pi m n}{N} \right) \right]_{m,n \in \{0, \ldots, N-1\}}. \tag{2}$$

Note that $F_N$ is symmetric ($F_N = F_N^T$) and unitary ($F_N F_N^H = F_N^H F_N = I_N$).

We denote by $f_m = \frac{1}{\sqrt{N}} [1, e^{j 2\pi m / N}, \ldots, e^{j 2\pi m (N-1)/N}]^T \in \mathbb{C}^N$ the m-th column of
the inverse DFT matrix $F_N^H$. Or alternatively, $f_m^H$ is the m-th row of $F_N$. The
normalized DFT of a vector h is simply given by $\hat{h} = F_N h$.

Theorem 1 ([12]). Any circulant matrix H can be diagonalized in the Fourier
basis $F_N$: the eigenvectors of H are given by the columns $\{f_m\}_{m \in \{0, \ldots, N-1\}}$ of
the inverse DFT matrix $F_N^H$, and the associated eigenvalues $\{\lambda_m\}_{m \in \{0, \ldots, N-1\}}$
are given by $\sqrt{N} \hat{h}$, i.e. by the DFT of the first column h of H:

$$\forall m \in \{0, \ldots, N-1\}, \quad H f_m = \lambda_m f_m, \quad \text{where } \lambda_m = \sqrt{N} \hat{h}_m = \sum_{n=0}^{N-1} h_n e^{-j 2\pi m n / N}.$$

Equivalently, in the Fourier domain, the circular convolution matrix H becomes
a diagonal matrix $\hat{H} = \mathrm{diag}\{\sqrt{N} \hat{h}\}$.

Corollary 1. Consider the circular convolution y = Hx of x and h. Let
$\hat{x} = F_N x$ and $\hat{h} = F_N h$ denote the normalized DFT of x and h. In the Fourier
domain, the circular convolution becomes a simple entry-wise multiplication of
the components of $\sqrt{N} \hat{h}$ with the components of $\hat{x}$: $\hat{y} = F_N y = \hat{H} \hat{x}$.
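
Corollary 1 can be checked numerically. The sketch below is illustrative only: it uses a naive O(N²) normalized DFT written with the standard library (a real implementation would use an FFT), and the helper names are ours, not the paper's.

```python
import cmath
import math


def dft(v):
    """Normalized DFT: (F_N v)[m] = (1/sqrt(N)) * sum_n v[n] * exp(-2j*pi*m*n/N)."""
    n_len = len(v)
    scale = 1.0 / math.sqrt(n_len)
    return [scale * sum(v[n] * cmath.exp(-2j * math.pi * m * n / n_len)
                        for n in range(n_len))
            for m in range(n_len)]


def inverse_dft(v_hat):
    """Apply F_N^H, the inverse of the normalized DFT."""
    n_len = len(v_hat)
    scale = 1.0 / math.sqrt(n_len)
    return [scale * sum(v_hat[m] * cmath.exp(2j * math.pi * m * n / n_len)
                        for m in range(n_len))
            for n in range(n_len)]


def convolve_via_dft(x, h):
    """Compute x * h as F_N^H (sqrt(N) * h_hat .* x_hat), following Corollary 1."""
    n_len = len(x)
    x_hat, h_hat = dft(x), dft(h)
    y_hat = [math.sqrt(n_len) * h_hat[m] * x_hat[m] for m in range(n_len)]
    # For real x and h the result is real up to rounding error.
    return [value.real for value in inverse_dft(y_hat)]
```

For x = (1, 2, 3, 4) and h = (1, 0, 0, 1) this reproduces the circular convolution computed from the definition, up to floating-point error.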

2.3 Privacy Model 

2.3.1 Differential Privacy 

Two real-valued input vectors $x, x' \in [0, 1]^N$ are neighbors when $\|x - x'\|_1 \le 1$.

Definition 3. A randomized algorithm $\mathcal{A}$ satisfies (ε, δ)-differential privacy if
for all neighbors $x, x' \in [0,1]^N$, and all measurable subsets T of the support of
$\mathcal{A}$, we have

$$\Pr[\mathcal{A}(x) \in T] \le e^{\varepsilon} \Pr[\mathcal{A}(x') \in T] + \delta,$$

where probabilities are taken over the randomness of $\mathcal{A}$.

2.3.2 Laplace Noise Mechanism 

Definition 4. A function $f : [0,1]^N \to \mathbb{C}$ has sensitivity s if s is the smallest
number such that for any two neighbors $x, x' \in [0, 1]^N$,

$$|f(x) - f(x')| \le s.$$

Theorem 2 ([8]). Let $f : [0,1]^N \to \mathbb{C}$ have sensitivity s. Suppose that on
input x, algorithm $\mathcal{A}$ outputs f(x) + z, where z ~ Lap(0, s/ε). Then $\mathcal{A}$ satisfies
(ε, 0)-differential privacy.
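
A minimal sketch of the Laplace mechanism for a real-valued query follows. It assumes a real-valued f (the theorem is stated for complex-valued f, which this sketch does not cover), and it samples Lap(0, s) as the difference of two independent exponential variables with mean s. The function names are hypothetical.

```python
import random


def sample_laplace(scale, rng):
    """Draw from Lap(0, scale) as a difference of two exponentials with mean `scale`."""
    if scale == 0:
        return 0.0
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_mechanism(value, sensitivity, epsilon, rng):
    """Release value + z with z ~ Lap(0, sensitivity / epsilon)."""
    return value + sample_laplace(sensitivity / epsilon, rng)


# Example: a counting query with sensitivity 1, released with epsilon = 0.5.
noisy_count = laplace_mechanism(42.0, sensitivity=1.0, epsilon=0.5,
                                rng=random.Random(7))
```

The noise scale s/ε grows as ε shrinks, which is the accuracy cost of a stronger privacy guarantee.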

2.3.3 Composition Theorems 

An important feature of differential privacy is its robustness: when an algorithm 
is a "composition" of several differentially private algorithms, the algorithm 
itself also satisfies differential privacy constraints, with the privacy parameters 
degrading smoothly. The results in this subsection quantify how the privacy 
parameters degrade. 

The first composition theorem is an easy consequence of the definition of 
differential privacy: 
Theorem 3 ([5]). Let $\mathcal{A}_1$ satisfy $(\varepsilon_1, \delta_1)$-differential privacy and $\mathcal{A}_2$ satisfy
$(\varepsilon_2, \delta_2)$-differential privacy, where $\mathcal{A}_2$ could take the output of $\mathcal{A}_1$ as input.
Then the algorithm which on input x outputs the tuple $(\mathcal{A}_1(x), \mathcal{A}_2(\mathcal{A}_1(x), x))$
satisfies $(\varepsilon_1 + \varepsilon_2, \delta_1 + \delta_2)$-differential privacy.

In a more recent paper, Dwork et al. proved a more sophisticated composition
theorem, which often gives asymptotically better bounds on the privacy
parameters. Next we state their theorem.

Theorem 4 ([10]). Let $\mathcal{A}_1, \ldots, \mathcal{A}_k$ be such that algorithm $\mathcal{A}_i$ satisfies $(\varepsilon_i, 0)$-differential
privacy. Then the algorithm that on input x outputs the tuple
$(\mathcal{A}_1(x), \ldots, \mathcal{A}_k(x))$ satisfies (ε, δ)-differential privacy for any δ > 0 and

$$\varepsilon \ge \sqrt{2 \ln(1/\delta) \sum_{i=1}^{k} \varepsilon_i^2}.$$
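
To see how the two composition theorems differ in practice, the sketch below compares the basic bound (sum of the ε_i) with a bound of the form ε = sqrt(2 ln(1/δ) Σ ε_i²). That form is an assumption here: it is the one consistent with the privacy constraint used later in the optimization problem of Section 4, and the function names are illustrative.

```python
import math


def basic_composition(epsilons):
    """Total epsilon under Theorem 3 for pure (eps_i, 0)-private mechanisms."""
    return sum(epsilons)


def advanced_composition(epsilons, delta):
    """Assumed form of the advanced bound: sqrt(2 ln(1/delta) * sum eps_i^2)."""
    return math.sqrt(2.0 * math.log(1.0 / delta) * sum(e * e for e in epsilons))


# 10,000 mechanisms, each (0.001, 0)-differentially private.
per_query = [0.001] * 10000
eps_basic = basic_composition(per_query)
eps_advanced = advanced_composition(per_query, delta=1e-6)
```

With many small ε_i the second bound grows like the square root of the number of mechanisms rather than linearly, at the price of a nonzero δ.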



2.4 Accuracy 

In this paper we are interested in differentially private algorithms for the convolution
problem. In the convolution problem, we are given a public sequence
$h = \{h_1, \ldots, h_N\}$ and a private sequence $x = \{x_1, \ldots, x_N\}$. Our goal is to
design an algorithm $\mathcal{A}$ that is (ε, δ)-differentially private with respect to the
private input x (taken as column vector x), and approximates the convolution
h * x. More precisely,

Definition 5. Given a vector $h \in \mathbb{R}^N$ which defines a convolution matrix H,
the mean (expected) squared error (MSE) of an algorithm $\mathcal{A}$ is defined as

$$\mathrm{MSE} = \sup_{x} \frac{1}{N} \mathbb{E}\left[ \|\mathcal{A}(x) - Hx\|_2^2 \right].$$

Note that MSE measures the mean expected squared error per output component.

3 Lower Bounds

In this section we derive a spectral lower bound on mean squared error of differentially
private approximation algorithms for circular convolution. We prove
that this bound is nearly tight for every fixed h in the following section. The
lower bound is stated as Theorem 5.

Theorem 5. Let $h \in \mathbb{R}^N$ be an arbitrary real vector and let us relabel the
Fourier coefficients of h so that $|\hat{h}_0| \ge \ldots \ge |\hat{h}_{N-1}|$. For all sufficiently small ε
and δ, the expected mean squared error MSE of any (ε, δ)-differentially private
algorithm $\mathcal{A}$ that approximates h * x is at least

$$\mathrm{MSE} = \Omega\left( \max_{K=1}^{N} \frac{K^2 |\hat{h}_{K-1}|^2}{N \log^2 N} \right). \tag{3}$$

For the remainder of the paper, we define the notation specLB(h) for the
right hand side of (3), i.e. $\mathrm{specLB}(h) = \max_{K=1}^{N} \frac{K^2 |\hat{h}_{K-1}|^2}{N \log^2 N}$.
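
The quantity specLB(h) is easy to evaluate for a given h. The sketch below assumes the form specLB(h) = max over K of K² |ĥ_{K−1}|² / (N log² N), with the Fourier coefficients sorted by decreasing magnitude and log taken in base 2 as in the Notation paragraph. It uses a naive normalized DFT; the names are illustrative.

```python
import cmath
import math


def dft(v):
    """Naive normalized DFT (O(N^2)); an FFT would be used in practice."""
    n_len = len(v)
    scale = 1.0 / math.sqrt(n_len)
    return [scale * sum(v[n] * cmath.exp(-2j * math.pi * m * n / n_len)
                        for n in range(n_len))
            for m in range(n_len)]


def spec_lb(h):
    """Assumed form: max_K K^2 |h_hat_{K-1}|^2 / (N log2(N)^2), sorted magnitudes."""
    n_len = len(h)
    magnitudes = sorted((abs(c) for c in dft(h)), reverse=True)
    best = max((k * k) * magnitudes[k - 1] ** 2 for k in range(1, n_len + 1))
    return best / (n_len * math.log2(n_len) ** 2)


flat_spectrum = spec_lb([1, 0, 0, 0])   # identity convolution: all |h_hat_i| equal
single_peak = spec_lb([1, 1, 1, 1])     # sum query: one nonzero coefficient
```

For N = 4 both examples evaluate to 0.25: the flat spectrum attains the maximum at K = N, the single-peak spectrum at K = 1.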

The proof of Theorem 5 is based on recent work [23] connecting combinatorial
discrepancy and privacy. Adapting a strategy due to Hardt and Talwar [15],
we instantiate the basic discrepancy lower bound for any matrix PA, where P
is a projection matrix, and use the maximum of these lower bounds. However,
we need to resolve several issues that arise in the setting of (ε, δ)-differential
privacy. While projection works naturally with the volume-based lower bounds
of Hardt and Talwar, the connection between the discrepancy of A and PA is not
immediate, since discrepancy is a combinatorially defined quantity. Our main
technical contribution in this section is analyzing the discrepancy of PA via the
determinant lower bound of Lovász, Spencer, and Vesztergombi. This approach was
generalized and extended by Nikolov, Talwar, and Zhang [24] to show nearly
optimal lower bounds for arbitrary linear maps.

We start our presentation with preliminaries from prior work and then we 
develop our lower bounds for convolutions. 

3.1 Discrepancy Preliminaries 

We define ($\ell_2$) hereditary discrepancy as

$$\mathrm{herdisc}(A) = \max_{W \subseteq [N]} \min_{v \in \{-1,+1\}^{W}} \|A|_W v\|_2.$$
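
For very small matrices the definition can be evaluated by brute force, which helps build intuition. The sketch below assumes the reading herdisc(A) = max over column subsets W of the minimum over sign vectors v on W of the Euclidean norm of A restricted to W times v. It is exponential in the number of columns and is meant only as an illustration; the function name is ours.

```python
import itertools
import math


def hereditary_discrepancy(matrix):
    """Brute-force l2 hereditary discrepancy of a small real matrix (list of rows)."""
    n_cols = len(matrix[0])
    best = 0.0
    for size in range(1, n_cols + 1):
        for cols in itertools.combinations(range(n_cols), size):
            # Discrepancy of the submatrix restricted to the columns in `cols`.
            disc = min(
                math.sqrt(sum(
                    sum(row[c] * s for c, s in zip(cols, signs)) ** 2
                    for row in matrix))
                for signs in itertools.product((-1, 1), repeat=size)
            )
            best = max(best, disc)
    return best


identity_disc = hereditary_discrepancy([[1, 0], [0, 1]])
repeated_disc = hereditary_discrepancy([[1, 1]])
```

The 2 x 2 identity has hereditary discrepancy sqrt(2), since no signing can cancel independent coordinates, while the single row (1, 1) has value 1: the two columns cancel together, so the maximum is attained on a single column.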

The following result connects discrepancy and differential privacy: 

Theorem 6 ([23]). Let A be an M x N complex matrix and let $\mathcal{A}$ be an (ε, δ)-differentially
private algorithm for sufficiently small constant ε and δ. There
exists a constant C and a vector $x \in \{0,1\}^N$ such that

$$\mathbb{E}\left[ \|\mathcal{A}(x) - Ax\|_2^2 \right] \ge C \frac{\mathrm{herdisc}(A)^2}{\log^2 N}.$$

The determinant lower bound for hereditary discrepancy due to Lovász,
Spencer, and Vesztergombi gives us a spectral lower bound on the noise required
for privacy.

Theorem 7 ([20]). There exists a constant C such that for any complex
M x N matrix A, $\mathrm{herdisc}(A) \ge C \max_{K, B} \sqrt{K} |\det(B)|^{1/K}$, where K ranges
over [min{M, N}] and B ranges over K x K submatrices of A.

Corollary 8. Let A be an M x N complex matrix and let $\mathcal{A}$ be an (ε, δ)-differentially
private algorithm for sufficiently small constant ε and δ. There
exists a constant C and a vector $x \in \{0, 1\}^N$ such that, for any K x K submatrix
B of A,

$$\mathbb{E}\left[ \|\mathcal{A}(x) - Ax\|_2^2 \right] \ge C \frac{K |\det(B)|^{2/K}}{\log^2 N}.$$

3.2 Proof of Theorem 5

We exploit the power of the determinant lower bound of Corollary 8 by combining
the simple but very useful observation that projections do not increase
mean squared error with a lower bound on the maximum determinant of a
submatrix of a rectangular matrix. We present these two ingredients in sequence
and finish the section with a proof of Theorem 5.

Lemma 1. Let A be an M x N complex matrix and let $\mathcal{A}$ be an (ε, δ)-differentially
private algorithm for sufficiently small constant ε and δ. There exists a constant
C and a vector $x \in \{0, 1\}^N$ such that for any L x M projection matrix P and
for any K x K submatrix B of PA,

$$\mathbb{E}\left[ \|\mathcal{A}(x) - Ax\|_2^2 \right] \ge C \frac{K |\det(B)|^{2/K}}{\log^2 N}.$$

Proof. We show that there exists an (ε, δ)-differentially private algorithm $\mathcal{B}$ that
satisfies

$$\mathbb{E}\left[ \|\mathcal{B}(x) - PAx\|_2^2 \right] \le \mathbb{E}\left[ \|\mathcal{A}(x) - Ax\|_2^2 \right]. \tag{4}$$

Then we can apply Corollary 8 to $\mathcal{B}$ and PA to prove the lemma.

The algorithm $\mathcal{B}$ on input x outputs Py where $y = \mathcal{A}(x)$. Since $\mathcal{B}$ is
a function of $\mathcal{A}(x)$ only, it satisfies (ε, δ)-differential privacy by Theorem 3.
It satisfies (4) since for any y and any projection matrix P it holds that
$\|P(y - Ax)\|_2 \le \|y - Ax\|_2$. □

Our main technical tool is a linear algebraic fact connecting the determinant 
lower bound for A and the determinant lower bound for any projection of A. 

Lemma 2. Let A be an M x N complex matrix with singular values $\lambda_1 \ge \ldots \ge \lambda_N$
and let P be a projection matrix onto the span of the left singular vectors
corresponding to $\lambda_1, \ldots, \lambda_K$. There exists a constant C and a K x K submatrix
B of PA such that

$$|\det(B)|^{1/K} \ge C \sqrt{\frac{K}{N}} \left( \prod_{i=1}^{K} \lambda_i \right)^{1/K}.$$

Proof. Let C = PA and consider the matrix $D = CC^H$. It has eigenvalues
$\lambda_1^2, \ldots, \lambda_K^2$, and therefore

$$\det(D) = \prod_{i=1}^{K} \lambda_i^2.$$

On the other hand, by the Binet-Cauchy formula for the determinant, we have

$$\det(D) = \det(CC^H) = \sum_{S \in \binom{[N]}{K}} |\det(C|_S)|^2 \le \binom{N}{K} \max_{S \in \binom{[N]}{K}} |\det(C|_S)|^2.$$

Rearranging and raising to the power 1/2K, we get that there exists a K x K
submatrix B of C such that

$$|\det(B)|^{1/K} \ge \binom{N}{K}^{-1/2K} \left( \prod_{i=1}^{K} \lambda_i \right)^{1/K}.$$

Using the bound $\binom{N}{K} \le \left( \frac{Ne}{K} \right)^K$ completes the proof. □



We can now prove our main lower bound theorem by combining Lemma 1
and Lemma 2.

Proof of Theorem 5. As usual, we will express h * x as the linear map Hx, where
H is the convolution matrix for h. By Lemma 1, it suffices to show that for
each K, there exists a projection matrix P and a K x K submatrix B of PH
such that $|\det(B)|^{1/K} \ge \Omega(\sqrt{K} |\hat{h}_{K-1}|)$. Recall that the eigenvalues of H are
$\sqrt{N} \hat{h}_0, \ldots, \sqrt{N} \hat{h}_{N-1}$, and, therefore, the i-th singular value of H is $\sqrt{N} |\hat{h}_{i-1}|$.
By Lemma 2, there exists a constant C, a projection matrix P, and a submatrix
B of PH such that

$$|\det(B)|^{1/K} \ge C \sqrt{\frac{K}{N}} \left( \prod_{i=1}^{K} \sqrt{N} |\hat{h}_{i-1}| \right)^{1/K} \ge C \sqrt{K} |\hat{h}_{K-1}|.$$

This completes the proof. □

4 Upper Bounds

Standard (ε, δ)-privacy techniques such as input perturbation or output perturbation
in the time or in the frequency domain lead to mean squared error, at
best, proportional to $\|h\|_2^2$.

Next we describe an algorithm which is nearly optimal for (ε, δ)-differential
privacy. This algorithm is derived by formulating the error of a natural class
of private algorithms as a convex program and finding a closed form solution.
An alternative solution that partitions the spectrum of H geometrically is described
in Appendix A. The class of algorithms we consider is those which add
independent Laplacian noise to the Fourier coefficients of the private input x.
Interestingly, we show that this simple strategy is nearly optimal for computing
convolution maps.

Consider the class of algorithms which first add independent Laplacian noise
variables $z_i \sim \mathrm{Lap}(0, b_i)$ to the Fourier coefficients $\hat{x}_i$ to compute $\tilde{x}_i = \hat{x}_i + z_i$,
and then output $\tilde{y} = F_N^H \hat{H} \tilde{x}$. This class of algorithms is parameterized by the
vector $b = (b_0, \ldots, b_{N-1})$; a member of the class will be denoted $\mathcal{A}(b)$ in the
sequel. The question we address is: For given ε, δ > 0, how should the noise
parameters b be chosen such that the algorithm $\mathcal{A}(b)$ achieves (ε, δ)-differential
privacy in x for $\ell_1$ neighbors, while minimizing the mean squared error MSE?
It turns out that by convex programming duality we can derive a closed form
expression for the optimal b, and moreover, the optimal $\mathcal{A}(b)$ is nearly optimal
among all (ε, δ)-differentially private algorithms. The optimal parameters are
used in Algorithm 1.

Theorem 9. Algorithm 1 satisfies (ε, δ)-differential privacy, and achieves expected
mean squared error

$$\mathrm{MSE} = 4 \frac{\ln(1/\delta)}{\varepsilon^2 N} \|\hat{h}\|_1^2. \tag{5}$$
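Algorithm 1 Fourier Mechanism

  Compute $\hat{x} = F_N x$ and $\hat{h} = F_N h$.
  Set $\gamma = \frac{2 \ln(1/\delta) \|\hat{h}\|_1}{\varepsilon^2 N}$.
  for all i ∈ {0, . . . , N − 1} do
    if $|\hat{h}_i| > 0$ then
      Set $z_i = \mathrm{Lap}(0, \sqrt{\gamma / |\hat{h}_i|})$
    else if $|\hat{h}_i| = 0$ then
      Set $z_i = 0$
    end if
    Set $\bar{y}_i = \sqrt{N} \hat{h}_i (\hat{x}_i + z_i)$.
  end for
  Output $\tilde{y} = F_N^H \bar{y}$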
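
The data flow of Algorithm 1 can be sketched in a few lines of Python. Several things here are assumptions rather than the authors' implementation: the noise scale sqrt(γ / |ĥ_i|) with γ = 2 ln(1/δ) ‖ĥ‖₁ / (ε² N) follows the closed form derived in the proof of Theorem 9; the noise is drawn as a real-valued Laplace variable although the Fourier coefficients are complex, a detail this section does not spell out; a small tolerance stands in for the exact test |ĥ_i| = 0; and a naive O(N²) DFT replaces the FFT. Treat it as an illustration of the steps, not as a vetted privacy mechanism.

```python
import cmath
import math
import random


def dft(v, sign=-1):
    """Normalized DFT (sign=-1) or its inverse (sign=+1), naive O(N^2) version."""
    n_len = len(v)
    scale = 1.0 / math.sqrt(n_len)
    return [scale * sum(v[n] * cmath.exp(sign * 2j * math.pi * m * n / n_len)
                        for n in range(n_len))
            for m in range(n_len)]


def sample_laplace(scale, rng):
    """Lap(0, scale) as a difference of two exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def fourier_mechanism(x, h, epsilon, delta, rng, zero_tol=1e-12):
    """Sketch of Algorithm 1: noise the Fourier coefficients of x, then convolve."""
    n_len = len(x)
    x_hat, h_hat = dft(x), dft(h)
    h_hat_l1 = sum(abs(c) for c in h_hat)
    gamma = 2.0 * math.log(1.0 / delta) * h_hat_l1 / (epsilon ** 2 * n_len)
    y_bar = []
    for i in range(n_len):
        magnitude = abs(h_hat[i])
        # zero_tol is an assumption: floating point rarely yields an exact zero.
        z = sample_laplace(math.sqrt(gamma / magnitude), rng) if magnitude > zero_tol else 0.0
        y_bar.append(math.sqrt(n_len) * h_hat[i] * (x_hat[i] + z))
    return dft(y_bar, sign=+1)  # complex-valued list; entry k approximates (h * x)[k]


noisy = fourier_mechanism([1, 2, 3, 4], [1, 0, 0, 1], epsilon=1.0, delta=1e-6,
                          rng=random.Random(3))
```

Coefficients with larger |ĥ_i| receive less noise, because errors in those directions are amplified more by the convolution.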



Moreover, Algorithm 1 runs in time O(N log N).

Before proving Theorem 9 we show that it implies that Algorithm 1 is almost
optimal for any given h.

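Theorem 10. For any h, Algorithm 1 satisfies (ε, δ)-differential privacy and
achieves expected mean squared error $O\left( \mathrm{specLB}(h) \frac{\log^2 N \log^2 |I| \log(1/\delta)}{\varepsilon^2} \right)$.

Proof. Assume that $|\hat{h}_0| \ge |\hat{h}_1| \ge \ldots \ge |\hat{h}_{N-1}|$. Then, by definition of
$I = \{0 \le i \le N-1 : |\hat{h}_i| > 0\}$, we have $|\hat{h}_j| = 0$ for all $j > |I| - 1$. Thus,

$$\|\hat{h}\|_1 = \sum_{i=0}^{|I|-1} |\hat{h}_i| = \sum_{i=1}^{|I|} \frac{1}{i} \, i |\hat{h}_{i-1}| \le \left( \sum_{i=1}^{|I|} \frac{1}{i} \right) \sqrt{N} \log N \sqrt{\mathrm{specLB}(h)} = H_{|I|} \sqrt{N} \log N \sqrt{\mathrm{specLB}(h)}, \tag{6}$$

where $H_m = \sum_{i=1}^{m} \frac{1}{i}$ denotes the m-th harmonic number. Recalling that $H_m = O(\log m)$,
and combining the bound (6) with the expression of the MSE (5)
yields the desired bound. □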

Proof of Theorem 9. For running time, we note that our algorithm is no more expensive
than computing a Fast Fourier Transform, which can be done in O(N log N)
arithmetic operations using the classical Cooley-Tukey algorithm, for example.

Denote the set $I = \{0 \le i \le N-1 : |\hat{h}_i| > 0\}$. We formulate the problem of
finding the algorithm $\mathcal{A}(b)$ which minimizes MSE subject to privacy constraints
as the following optimization problem:

$$\min_{\{b_i\}_{i \in I}} \sum_{i \in I} b_i^2 |\hat{h}_i|^2 \tag{7}$$

$$\text{s.t.} \quad \sum_{i \in I} \frac{1}{N b_i^2} = \frac{\varepsilon^2}{2 \ln(1/\delta)} \tag{8}$$

$$b_i > 0, \quad \forall i \in I. \tag{9}$$

Next we justify this formulation. 

Privacy Constraint. We first show that the output $\tilde{y}$ of an algorithm $\mathcal{A}(b)$
is an (ε, δ)-differentially private function of x, if the constraint (8) is satisfied.
Denote $\bar{y} = \hat{H} \tilde{x}$. If $\bar{y}$ is an (ε, δ)-differentially private function of x, then by
Theorem 3, $\tilde{y}$ is also (ε, δ)-differentially private, since the computation of $\tilde{y}$
depends only on $F_N^H$ and $\bar{y}$ and not on x directly. Thus we can focus on the
requirements on b for which $\bar{y}$ is (ε, δ) private.

If $i \notin I$, then $\bar{y}_i = 0$ and does not affect privacy regardless of $b_i$. Thus, we
can set $b_i = 0$ for all $i \notin I$. If $i \in I$, we first characterize the $\ell_1$-sensitivity of $\hat{x}_i$ as
a function of x. Recall that $\hat{x}_i = f_i^H x$ is the inner product of x with the Fourier
basis vector $f_i$. The sensitivity of $\hat{x}_i$ is therefore $\|f_i\|_\infty = \frac{1}{\sqrt{N}}$. Then, by
Theorem 2, $\tilde{x}_i = \hat{x}_i + \mathrm{Lap}(0, b_i)$ is $\varepsilon_i$-differentially private in x, with $\varepsilon_i = \frac{1}{\sqrt{N} b_i}$.

The computation of $\bar{y}_i$ depends only on $\hat{h}_i$ and $\tilde{x}_i$, thus, by Theorem 3, $\bar{y}_i$ is
$\varepsilon_i$-differentially private in x.

Finally, by Theorem 4, $\bar{y}$ is (ε, δ) differentially private for any δ > 0, as long
as constraint (8) holds.

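Accuracy Objective. We show that finding the algorithm $\mathcal{A}(b)$ which
minimizes the MSE is equivalent to finding the parameters $b_i > 0$, $i \in I$, which
minimize the objective function (7). Note that $\tilde{y} = F_N^H \hat{H} \tilde{x} = F_N^H \hat{H} (F_N x + z) = y + F_N^H \hat{H} z$.
Thus, the output $\tilde{y}$ is unbiased: $\mathbb{E}[\tilde{y}] = y$. The mean squared error
is given by:

$$\mathrm{MSE} = \frac{1}{N} \mathbb{E}\left[ \|F_N^H \hat{H} z\|_2^2 \right] = \frac{1}{N} \mathbb{E}\left[ \mathrm{tr}\left( F_N^H \hat{H} z z^H \hat{H}^H F_N \right) \right] = \frac{1}{N} \mathrm{tr}\left( \hat{H}^H \hat{H} \, \mathbb{E}[z z^H] \right) = 2 \sum_{i \in I} |\hat{h}_i|^2 b_i^2,$$

which yields the desired objective function (7).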

Closed Form Solution. The program (7)-(9) is convex in $1/b_i^2$. Using the
KKT conditions of this program, we can derive a closed form optimal solution:

$$b_i^* = \sqrt{\frac{2 \ln(1/\delta) \|\hat{h}\|_1}{N \varepsilon^2 |\hat{h}_i|}}$$

when $i \in I$ and $b_i^* = 0$ otherwise. Substituting
these values back into the objective finishes the proof. Full details of the analysis
of the convex program can be found in Appendix B. □
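
The closed form can be sanity-checked numerically. The sketch below assumes the program has objective Σ b_i² |ĥ_i|², privacy constraint Σ 1/(N b_i²) = ε² / (2 ln(1/δ)) over the indices with |ĥ_i| > 0, and MSE equal to twice the objective. It verifies that the closed-form scales meet the constraint with equality and give MSE 4 ln(1/δ) ‖ĥ‖₁² / (ε² N). Function names and the example magnitudes are made up for illustration.

```python
import math


def optimal_scales(h_hat_abs, epsilon, delta):
    """Closed-form b_i* = sqrt(2 ln(1/delta) ||h_hat||_1 / (N eps^2 |h_hat_i|)), 0 off I."""
    n_len = len(h_hat_abs)
    gamma = 2.0 * math.log(1.0 / delta) * sum(h_hat_abs) / (n_len * epsilon ** 2)
    return [math.sqrt(gamma / m) if m > 0 else 0.0 for m in h_hat_abs]


def privacy_budget_used(scales, n_len):
    """Left-hand side of the privacy constraint: sum over I of 1 / (N b_i^2)."""
    return sum(1.0 / (n_len * b * b) for b in scales if b > 0)


def mse_of_scales(h_hat_abs, scales):
    """MSE = 2 * sum_i |h_hat_i|^2 b_i^2 (Laplace variance is 2 b^2)."""
    return 2.0 * sum((m * b) ** 2 for m, b in zip(h_hat_abs, scales))


magnitudes = [2.0, 1.0, 0.5, 0.0]   # hypothetical |h_hat_i| values
eps, dlt = 0.5, 1e-5
b_star = optimal_scales(magnitudes, eps, dlt)
mse_star = mse_of_scales(magnitudes, b_star)

# A feasible but non-optimal alternative: the same scale for every i in I.
support = sum(1 for m in magnitudes if m > 0)
b_flat = math.sqrt(2.0 * math.log(1.0 / dlt) * support / (len(magnitudes) * eps ** 2))
mse_flat = mse_of_scales(magnitudes, [b_flat if m > 0 else 0.0 for m in magnitudes])
```

The uniform choice spends the same privacy budget but yields a larger error whenever the magnitudes differ, which is the Cauchy-Schwarz gap between |I| Σ|ĥ_i|² and (Σ|ĥ_i|)².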
5 Generalizations and Applications



In this section we describe some generalizations and applications of our lower 
bounds and algorithms for private convolution. 

5.1 Compressible Convolutions 

A case of special interest is convolutions h * x where h is a compressible sequence.
Such cases appear in practice in signal processing. For compressible
h we can show that Algorithm 1 outperforms input and output perturbation.
First we present a definition of compressible sequences and then we give the improved
upper bounds. A specific example of private compressible convolutions
is developed in Section 5.4 in the context of computing marginal queries.

Definition 6. A vector $h \in \mathbb{R}^N$ is $(c,p)$-compressible (in the Fourier basis) if
it satisfies:

$$\forall\, 0 \le i \le N-1: \quad |\hat{h}_i|^2 \le \frac{c^2}{(i+1)^p}.$$

Theorem 11. Let $h$ be a $(c,p)$-compressible vector for some constant $p \ge 2$.
Then Algorithm 1 satisfies $(\varepsilon,\delta)$-differential privacy and achieves expected
mean squared error $O\left(\frac{c^2 \log^2 N \log(1/\delta)}{N\varepsilon^2}\right)$ for $p = 2$, and for $p > 2$ achieves

$$O\left(\left(\frac{cp}{p-2}\right)^2 \frac{\log(1/\delta)}{N\varepsilon^2}\right).$$



Notice that the bound on squared error improves on input and output
perturbation by a factor $O(\frac{1}{N})$.

The proof of Theorem 11 follows from Theorem 5 and the following lemma.

Lemma 3. Let $h$ be a $(c,p)$-compressible vector for some $p \ge 2$. Then, we have

$$\|\hat{h}\|_1 = \sum_{i=0}^{N-1} |\hat{h}_i| \le
\begin{cases}
c(1 + \ln N), & \text{if } p = 2 \\
\frac{cp}{p-2}, & \text{if } p > 2
\end{cases}$$



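Proof. Approximating a sum by an integral in the usual way, for $0 \le a \le b$ and
$p \ge 2$, we have

$$\sum_{i=a}^{b} \frac{1}{(i+1)^{p/2}}
= \frac{1}{(a+1)^{p/2}} + \sum_{i=a+1}^{b} \frac{1}{(i+1)^{p/2}}
\le \frac{1}{(a+1)^{p/2}} + \int_{a+1}^{b+1} \frac{dx}{x^{p/2}}.$$

Bounding the integral on the right hand side, we get

$$\sum_{i=a}^{b} \frac{1}{(i+1)^{p/2}} \le
\begin{cases}
\frac{1}{a+1} + \ln\frac{b+1}{a+1}, & \text{if } p = 2 \\
\frac{1}{(a+1)^{p/2}} + \frac{1}{(p/2-1)(a+1)^{p/2-1}}, & \text{if } p > 2
\end{cases}$$

The lemma then follows from the definition of $(c,p)$-compressibility, taking
$a = 0$ and $b = N - 1$. □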
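
The bound of Lemma 3 can be checked numerically in the extremal case where every coefficient meets the compressibility bound with equality. The sketch below assumes the definition $|\hat{h}_i|^2 \le c^2/(i+1)^p$; the helper names `l1_bound` and `worst_case_l1` are illustrative only.

```python
import math

def l1_bound(c, p, n):
    """Right-hand side of Lemma 3 for a (c, p)-compressible vector of length n."""
    if p == 2:
        return c * (1.0 + math.log(n))
    return c * p / (p - 2.0)

def worst_case_l1(c, p, n):
    """||h_hat||_1 when |h_hat_i| = c / (i + 1)^(p/2) for every i."""
    return sum(c / (i + 1) ** (p / 2.0) for i in range(n))

# Compare the extremal l1 norm against the bound for a few (p, N) pairs.
checks = {
    (p, n): (worst_case_l1(1.5, p, n), l1_bound(1.5, p, n))
    for p in (2, 3, 4)
    for n in (8, 1024)
}
```

For $p = 2$ the extremal norm is a harmonic sum, which grows like $\ln N$; for $p > 2$ it stays below a constant independent of $N$, which is what drives the improved error in Theorem 11.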
5.2 Running Sum

Running sums can be defined as the circular convolution $x' * h$ of the
sequences $h = (1, \ldots, 1, 0, \ldots, 0)$, where there are $N$ ones and $N$ zeros, and
$x' = (x, 0, \ldots, 0)$, where the private input $x$ is padded with $N$ zeros. An
elementary computation reveals that $\hat{h}_0 = \sqrt{N}$ and $\hat{h}_i = O(N^{-1/2})$ for all $i \ge 1$.
By Theorem 9, Algorithm 1 computes running sums with mean squared error
$O(1)$ (ignoring dependence on $\varepsilon$ and $\delta$), improving on prior bounds in
the mean squared error regime.
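
The padding construction can be sketched in a few lines of Python. This is a non-private illustration only: it uses a naive quadratic-time circular convolution rather than an FFT, and the function names are hypothetical.

```python
def circular_convolution(u, v):
    """Naive O(M^2) circular convolution of two equal-length sequences."""
    m = len(u)
    return [sum(u[j] * v[(t - j) % m] for j in range(m)) for t in range(m)]

def running_sums(x):
    """Running sums of x via circular convolution with a padded all-ones filter."""
    n = len(x)
    h = [1] * n + [0] * n          # N ones followed by N zeros
    x_padded = list(x) + [0] * n   # private input padded with N zeros
    return circular_convolution(x_padded, h)[:n]

print(running_sums([3, 1, 4, 1, 5]))  # [3, 4, 8, 9, 14]
```

The zero padding guarantees that no wrap-around term contributes to the first $N$ outputs, so those outputs are exactly the prefix sums of $x$.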

5.3 Linear Filters in Time Series Analysis 

Linear filtering is a fundamental tool in analysis of time-series data. A time
series is modeled as a sequence $x = (x_t)_{t=-\infty}^{\infty}$, supported on a finite set of time
steps. A filter converts the time series into another time series. A linear filter
does so by computing the convolution of $x$ with a series of filter coefficients
$w$, i.e. computing $y_t = \sum_{i=-\infty}^{\infty} w_i x_{t-i}$. For a finitely supported $x$, $y$ can be
computed using circular convolution by restricting $x$ to its support set and
padding with zeros on both sides.
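
A minimal, non-private sketch of this reduction is given below for a filter with finitely many nonzero coefficients. The representation of the filter as a list `w` together with an `offset` (so that `w[k]` stores $w_{k - \text{offset}}$) is an assumption made for the example, as are the function names.

```python
def circular_convolution(u, v):
    """Naive O(M^2) circular convolution of two equal-length sequences."""
    m = len(u)
    return [sum(u[j] * v[(t - j) % m] for j in range(m)) for t in range(m)]

def linear_filter_direct(x, w, offset):
    """y_t = sum_i w_i x_{t-i}, where w[k] holds the coefficient w_{k - offset}."""
    n = len(x)
    y = []
    for t in range(n):
        total = 0.0
        for k, coeff in enumerate(w):
            i = k - offset
            if 0 <= t - i < n:      # x is zero outside its support
                total += coeff * x[t - i]
        y.append(total)
    return y

def linear_filter_circular(x, w, offset):
    """Same filter, computed as a circular convolution of zero-padded sequences."""
    n, r = len(x), len(w)
    m = n + r                       # enough padding that wrap-around terms vanish
    x_pad = list(x) + [0.0] * r
    w_pad = [0.0] * m
    for k, coeff in enumerate(w):
        w_pad[(k - offset) % m] = coeff   # negative lags wrap to the end
    return circular_convolution(x_pad, w_pad)[:n]

x = [1.0, 2.0, 3.0, 4.0, 5.0]
w = [0.25, 0.5, 0.25]               # centered smoothing filter: w_{-1}, w_0, w_1
direct = linear_filter_direct(x, w, offset=1)
circular = linear_filter_circular(x, w, offset=1)
```

Both routes give the same filtered series, which is why a private algorithm for circular convolution can be applied to linear filters of this kind.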

We consider the case where $x$ is a time series of sensitive events. Each
element $x_i$ is a count of events or sum of values of individual transactions that
have occurred at time step i. When we deal with values of transactions, we 
assume that individual transactions have much smaller value than the total. We 
emphasize that the definition of differential privacy with respect to x defined 
this way corresponds to event-level privacy. Semantically, this guarantee implies 
that even an adversary who has arbitrary information about all but a single 
event of interest cannot find out with certainty whether the event of interest 
has occurred. This guarantee is weaker than the user-level guarantee, which
implies that knowing all events related to all but a single user of interest provides 
little information about the user. The user-level guarantee would unfortunately 
require excessive noise for filtering time series data, as the sensitivity of the 
convolution query becomes unbounded. On the other hand, the event-level 
guarantee is often sufficient, specifically in settings when sensitive events occur 
only infrequently. 

We consider applications to financial analysis, but our methods are
applicable to other instances of time series data, e.g. we may also consider network
traffic logs or a time series of movie ratings on an online movie streaming
service. We can perform almost optimal differentially private linear filtering by
casting the filter as a circular convolution. Next we briefly describe a couple of
applications of private linear filtering to financial analysis. For more references
and a detailed description, we refer the reader to the book of Gençay, Selçuk, and
Whitcher [11].

Volatility Estimation. The value at risk measure is used to estimate
the potential change in the value of a good or financial instrument. Assume, for
example, that in an online advertising system we would like to estimate potential
changes in the number of clicks per day for a set of display ad campaigns, and
denote by $x_i$ the number of clicks on day $i$ from the start of the campaigns. The
sensitive event is assumed to be a single ad click, for example a click on an ad for
a type of medical treatment. In order to estimate volatility, we need to estimate
a measure of the deviation of the $x_i$ for a given time period $[t - W + 1, t]$. It is
appropriate to give older fluctuations less weight. One way to do this
is by using linear filtering of the time series of absolute deviations in the click
counts:

$$\sigma_t = \frac{1}{\sum_{i=0}^{W-1} \lambda^i} \sum_{i=0}^{W-1} \lambda^i \, |x_{t-i} - \bar{x}_{t-i}|,$$

where $\lambda$ is a decay parameter and $\bar{x}_t$ is the average count over $[t - W + 1, t]$.
The quantity $\bar{x}_t$ is itself given by the convolution $\frac{1}{W}\sum_{i=0}^{W-1} x_{t-i}$ and can be
computed nearly optimally using Algorithm 1. Given the sequence $\bar{x}$, we can
construct the time series $y = (|x_i - \bar{x}_i|)_i$. Using the triangle inequality, one
can verify that for a fixed value of $\bar{x}$, $\|y - y'\|_1 \le \|x - x'\|_1$, and therefore an
algorithm which is differentially private with respect to $y$ is also differentially
private with respect to $x$. Therefore, we can use Algorithm 1 to estimate $\sigma_t$
with nearly optimal mean squared error.
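
The non-private pipeline behind this estimate can be sketched as follows. The boundary convention (entries before time 0 are treated as zero), the toy click counts, and all function names are assumptions made for the example; the snippet also checks the sensitivity claim for one pair of neighboring inputs with the window averages held fixed.

```python
def window_average(x, W):
    """x_bar_t = (1/W) * sum_{i=0}^{W-1} x_{t-i}, treating x_s = 0 for s < 0."""
    return [sum(x[max(0, t - W + 1): t + 1]) / W for t in range(len(x))]

def abs_deviations(x, x_bar):
    """y_t = |x_t - x_bar_t|."""
    return [abs(a - m) for a, m in zip(x, x_bar)]

def decayed_volatility(y, W, lam):
    """sigma_t = (sum_i lam^i y_{t-i}) / (sum_i lam^i), i = 0..W-1, y_s = 0 for s < 0."""
    norm = sum(lam ** i for i in range(W))
    return [sum(lam ** i * y[t - i] for i in range(W) if t - i >= 0) / norm
            for t in range(len(y))]

x = [10, 12, 9, 15, 11, 14]
x_neighbor = [10, 12, 10, 15, 11, 14]            # one extra click on day 2
x_bar = window_average(x, W=2)
y = abs_deviations(x, x_bar)
y_neighbor = abs_deviations(x_neighbor, x_bar)   # x_bar held fixed, as in the text
sigma = decayed_volatility(y, W=2, lam=0.5)

l1_y = sum(abs(a - b) for a, b in zip(y, y_neighbor))
l1_x = sum(abs(a - b) for a, b in zip(x, x_neighbor))
```

Both `window_average` and `decayed_volatility` are linear filters, so each step that touches private data is a convolution of the kind Algorithm 1 handles.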

Computing $\bar{x}$ was treated in [3] as the window sums problem, together with
other decayed sum problems. The quantity $\sigma_t$ is an exponentially decayed sum
computed over a window and can be approximated under $\varepsilon$-differential privacy
using the methods of [3]. However, as noted above, Algorithm 1 gives improved
mean squared error guarantees for window sums, as well as a near-optimality
guarantee.

Business Cycle Analysis. The goal of business cycle analysis is to extract 
cyclic components in the time series and smooth-out spurious fluctuation. Two 
classical methods for business-cycle analysis are the Hodrick-Prescott filter and 
the Baxter-King filter. Here we briefly sketch the form of the Hodrick-Prescott 
(HP) filter. Let us take the example of a time series $x$ of ad clicks again, with a
single component $x_i$ giving the number of clicks on a set of ads per day or per hour.
We can use the HP filter to detect cyclical trends in ad clicking activity. The 
filtered-out cyclical (smooth) component of the data extracted by the HP filter 
can be written as a convolution of the following form: 

\3=0 

Above, $\lambda$ is a smoothing parameter: the larger $\lambda$ is, the more the data is
smoothed by the filter; $\theta_i$ and $A_i$ are functions of $\lambda$. In principle, this is a
convolution of infinite time series, but in practice we truncate the series to a
finite length.

5.4 Generalized Marginal Queries 

Marginal queries are a class of queries posed to $d$-attribute binary databases,
i.e. databases where each row of the database is associated with a $d$-bit binary
vector, corresponding to the values of $d$ binary attributes. A marginal query is
specified by a setting $a \in \{0,1\}^d$ of the $d$ attributes and a subset $S \subseteq [d]$ of $k$
attributes; the exact answer to the query is the number of rows in the database
consistent with $a$ on $S$. In this subsection we address the error required to
privately answer a natural generalization of marginal queries. A generalized
marginal query is specified by a setting $a \in \{0,1\}^d$ of the $d$ attributes and a
$w$-DNF $h$, and the exact answer is the number of rows $b \in \{0,1\}^d$ in the private
database for which $h(a \oplus b)$ is satisfied (here $\oplus$ is componentwise XOR). In
the case of traditional marginal queries the DNF $h$ is a single disjunction of $k$
unnegated variables. Generalized marginals however allow more complex queries
such as, for example, "show all users who agree with $a$ on $a_1$ and at least one
other attribute".

More formally, we encode a binary $d$-attribute database in histogram
representation as a function $x : \{0,1\}^d \to [n]$. The value of $x(a)$ for $a \in \{0,1\}^d$
corresponds to the number of rows in the database with attribute setting $a$, and
$n$ is the database size.

Definition 7. Let $h(c)$ be a $w$-DNF given by
$h(c) = (\ell_{1,1} \wedge \ldots \wedge \ell_{1,w}) \vee \ldots \vee (\ell_{s,1} \wedge \ldots \wedge \ell_{s,w})$,
where $\ell_{i,j}$ is a literal, i.e. either $c_p$ or $\bar{c}_p$ for some $p \in [d]$.
The generalized marginal function for $h$ and a database $x : \{0,1\}^d \to [n]$ is a
function $(x * h) : \{0,1\}^d \to [n]$ defined by

$$(x * h)(a) = \sum_{b \in \{0,1\}^d} x(b)\, h(a \oplus b).$$
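
For a small number of attributes, the generalized marginal function can be evaluated directly from its definition. The sketch below is a brute-force, non-private illustration; the toy database, the particular single-term DNF, and the function names are all hypothetical.

```python
from itertools import product

def generalized_marginal(x, h, d):
    """(x * h)(a) = sum over b in {0,1}^d of x(b) * h(a XOR b)."""
    cube = list(product((0, 1), repeat=d))

    def xor(a, b):
        return tuple(ai ^ bi for ai, bi in zip(a, b))

    return {a: sum(x.get(b, 0) * h(xor(a, b)) for b in cube) for a in cube}

# Toy database over d = 3 binary attributes, in histogram form: x[b] = number of rows equal to b.
x = {(0, 0, 0): 2, (0, 1, 1): 1, (1, 0, 1): 3}

def h(c):
    """Hypothetical single-term DNF: satisfied when c is 0 on attributes 0 and 1."""
    return int(c[0] == 0 and c[1] == 0)

answers = generalized_marginal(x, h, d=3)
# answers[a] counts the rows that agree with a on attributes 0 and 1.
```

With this choice of `h`, the value `answers[a]` depends only on the first two bits of `a`, which recovers an ordinary marginal query on those two attributes.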

The overload of notation for $x * h$ here is on purpose, as generalized marginals
can be interpreted as an instance of a generalization of circular convolutions.
In particular, circular convolutions are associated naturally with the group of
addition modulo $N$, while generalized marginals are an instance of convolutions
associated with the group of addition modulo 2 of $d$-dimensional binary vectors
(formally $(\mathbb{Z}/2\mathbb{Z})^d$). Moreover, there is a Fourier transform that diagonalizes
convolutions over $(\mathbb{Z}/2\mathbb{Z})^d$ and that shares with the transform defined in
Section 2 all the properties which are necessary for our lower and upper bound
arguments. In particular, we need that any component of any Fourier basis vector
has norm $1/\sqrt{N}$, which is true for the Fourier transform diagonalizing
convolutions over $(\mathbb{Z}/2\mathbb{Z})^d$. Therefore, we can privately approximate generalized
marginal queries using Algorithm 1 and, furthermore, our analysis of the privacy
and accuracy guarantees for the algorithm still holds. Using results from
learning theory on the spectral concentration of bounded width DNFs and the bound
from Section 5.1, we can show that Algorithm 1 gives non-trivial error for
generalized marginal queries.
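
The transform in question is the Walsh-Hadamard transform. The following sketch, with hypothetical function names and toy vectors indexed by integers whose bits encode elements of $\{0,1\}^d$, checks numerically that the orthonormal transform turns XOR convolution into a coordinate-wise product scaled by $\sqrt{N}$, and that every basis entry has magnitude $1/\sqrt{N}$.

```python
import math

def walsh_hadamard(v):
    """Orthonormal Walsh-Hadamard transform of a list whose length is a power of 2."""
    out = list(v)
    n = len(out)
    step = 1
    while step < n:
        for start in range(0, n, 2 * step):
            for j in range(start, start + step):
                a, b = out[j], out[j + step]
                out[j], out[j + step] = a + b, a - b   # butterfly step
        step *= 2
    return [val / math.sqrt(n) for val in out]

def xor_convolution(x, h):
    """(x * h)(a) = sum_b x(b) h(a XOR b), with indices read as bit vectors."""
    n = len(x)
    return [sum(x[b] * h[a ^ b] for b in range(n)) for a in range(n)]

x = [2.0, 0.0, 1.0, 3.0]
h = [1.0, 0.0, 0.0, 1.0]
n = len(x)

lhs = walsh_hadamard(xor_convolution(x, h))
rhs = [math.sqrt(n) * a * b for a, b in zip(walsh_hadamard(x), walsh_hadamard(h))]
basis_column = walsh_hadamard([1.0, 0.0, 0.0, 0.0])   # entries of one basis vector
```

Here `lhs` and `rhs` agree coordinate by coordinate, mirroring the role the discrete Fourier transform plays for circular convolution.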

Theorem 12. Let $h$ be a $w$-DNF and $x : \{0,1\}^d \to [n]$ be a private database.
Algorithm 1 satisfies $(\varepsilon, \delta)$-differential privacy and computes the generalized
marginal $x * h$ for $h$ and $x$ with mean squared error bounded by
$O\left(\frac{\log(1/\delta)}{\varepsilon^2}\, 2^{d(1 - 1/O(w \log w))}\right)$.

In addition to this explicit bound, we also know (by Theorem 14) that up
to a factor of $d^4$, Algorithm 1 is optimal for computing generalized marginal
functions. Notice that the error bound we proved improves on randomized response
by a factor of $2^{-\Omega(d/(w \log w))}$; interestingly this factor is independent of the size
of the $w$-DNF formula.

In related work, Hardt et al. [2] considered database queries that can be
computed by an $AC^0$ circuit. Generalized marginal queries can be computed
by a two-layer $AC^0$ circuit. However, our results are incomparable to theirs, as
they consider the setting where the database is of bounded size $\|x\|_1 \le n$ and
our error bounds are independent of $\|x\|_1$. Our error bounds improve on
theirs when the database is large enough so that our error bound is
sublinear in database size.

The proof of Theorem 12 follows from Lemma 3 and the following
concentration result for the spectrum of $w$-DNF formulas, originally proved by
Mansour [22] in the context of learning under the uniform distribution.

Theorem 13 ([22]). Let $h : \{0,1\}^d \to \{0,1\}$ be a $w$-DNF. Let $T \subseteq 2^{[d]}$ be the
index set of the top $2^{d-k}$ Fourier coefficients of $h$. Then,

$$\sum_{S \notin T} |\hat{h}(S)|^2 \le 2^{d - \frac{d-k}{O(w \log w)}}.$$

6 Conclusion 

We derive nearly tight upper and lower bounds on the error of $(\varepsilon, \delta)$-differentially
private algorithms for computing convolutions. Our lower bounds rely on recent
general lower bounds based on discrepancy theory and elementary linear algebra; our
upper bound is a simple computationally efficient algorithm. We also sketch
several applications of private convolutions, in time series analysis and in
computing generalized marginal queries on a $d$-attribute database.

Our results are nearly optimal for any $h$ when the database size is large
enough with respect to the number of queries. In some settings, however, it is
reasonable to assume that the database size is much smaller, and our algorithms
give suboptimal error for such sparse databases. Nearly optimal algorithms for
computing a workload of $M$ linear queries posed to a database of size at most
$n$ were given in [24], but their algorithm has running time at least $O(M^2 N n)$.
Since our dense case algorithm for computing convolutions has running time
$O(N \log N)$, an interesting open problem is to give an algorithm with running
time $O(N n\, \mathrm{polylog}(N, n))$ for computing convolutions with optimal error when
the database size is at most $n$.
A Spectrum Partitioning Algorithm



We partition the spectrum of the convolution matrix $H$ into groups of
geometrically growing size and add a different amount of noise to each group. Noise is
added in the Fourier domain, i.e. to the Fourier coefficients of the private input
$x$. The most noise is added to those Fourier coefficients which correspond to
small (in absolute value) coefficients of $\hat{h}$, making sure that privacy is
satisfied while the least amount of noise is added. In the analysis of optimality, we
show that the noise added to each group can be charged to the lower bound
specLB($h$). Because the number of groups is logarithmic in $N$, we get almost
optimality. This analysis is inspired by the work of Hardt and Talwar [15].
However, our algorithm is simpler and significantly more efficient.

The $(\varepsilon, \delta)$-differentially private algorithm we propose for approximating
$h * x$ is shown as Algorithm 2. In the remainder of this section we assume for
simplicity that $N$ is a power of 2. We also assume, for ease of notation, that
$|\hat{h}_0| \ge \ldots \ge |\hat{h}_{N-1}|$. Our algorithm and analysis do not depend on $i$ except as
an index, so this comes without loss of generality.

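Algorithm 2 SpectralPartition

Set $\eta = \sqrt{2(1 + \log N)\ln(1/\delta)}\,/\,\varepsilon$.
Compute $\hat{x} = F_N x$ and $\hat{h} = F_N h$.
Set $\tilde{x}_0 = \hat{x}_0 + \mathrm{Lap}(\eta)$ and $\bar{y}_0 = \sqrt{N}\hat{h}_0\tilde{x}_0$.
for all $k \in [1, \log N]$ do
    for all $i \in [N/2^k, N/2^{k-1} - 1]$ do
        Set $\tilde{x}_i = \hat{x}_i + \mathrm{Lap}(\eta 2^{-k/2})$.
        Set $\bar{y}_i = \sqrt{N}\hat{h}_i\tilde{x}_i$.
    end for
end for
Output $\tilde{y} = F_N^H \bar{y}$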
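
The noise schedule of Algorithm 2 and its privacy accounting can be sketched without drawing any noise. The snippet below assumes that each Fourier coefficient has sensitivity $1/\sqrt{N}$ and that the composition theorem is used in the simplified form $\varepsilon'^2 = 2\ln(1/\delta)\sum_i \varepsilon_i^2$; both are assumptions of this illustration, and `spectral_partition_scales` is a hypothetical helper name.

```python
import math

def spectral_partition_scales(n, eps, delta):
    """Laplace scale for each (sorted) Fourier index; n must be a power of 2."""
    log_n = n.bit_length() - 1
    eta = math.sqrt(2.0 * (1.0 + log_n) * math.log(1.0 / delta)) / eps
    scales = [eta] + [0.0] * (n - 1)              # index 0 gets Lap(eta)
    for k in range(1, log_n + 1):
        lo, hi = n >> k, (n >> (k - 1)) - 1       # group k = [N/2^k, N/2^(k-1) - 1]
        for i in range(lo, hi + 1):
            scales[i] = eta * 2.0 ** (-k / 2.0)
    return scales

n, eps, delta = 16, 1.0, 1e-5
scales = spectral_partition_scales(n, eps, delta)

# Per-coefficient privacy loss: eps_i = sensitivity / scale = 1 / (sqrt(N) * scale_i).
eps_sq_sum = sum(1.0 / (n * s * s) for s in scales)
eps_total = math.sqrt(2.0 * math.log(1.0 / delta) * eps_sq_sum)
```

Each of the $\log N$ groups contributes the same amount to `eps_sq_sum`, which is why the total stays within the budget `eps` despite the scales shrinking geometrically for the low indices.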



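Lemma 4. Algorithm 2 satisfies $(\varepsilon, \delta)$-differential privacy. Also, there exists
an absolute constant $C$ such that Algorithm 2 achieves expected mean squared
error

$$\mathrm{MSE} \le C\, \frac{(1 + \log N)\ln(1/\delta)}{\varepsilon^2}
\left( |\hat{h}_0|^2 + \sum_{k=1}^{\log N} \frac{1}{2^k} \sum_{i=N/2^k}^{N/2^{k-1}-1} |\hat{h}_i|^2 \right). \quad (10)$$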

Proof. Privacy. We claim that $\tilde{x}$ is an $(\varepsilon, \delta)$-differentially private function of
$x$. The other computations depend only on $\hat{h}$ and $\tilde{x}$ and not on $x$ directly, so,
by Theorem 3, incur no loss in privacy.

First we analyze the sensitivity of each Fourier coefficient $\hat{x}_i$. As a function
of $x$, $\hat{x}_i$ is an inner product of $x$ with a Fourier basis vector. Let that vector be
$f$ and let $x$, $x'$ be two neighboring inputs, i.e. $\|x - x'\|_1 \le 1$. Then we have

$$|f^H (x - x')| \le \|f\|_\infty \|x - x'\|_1 \le \frac{1}{\sqrt{N}}.$$
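Therefore, by Theorem 2, when $i \in [N/2^k, N/2^{k-1} - 1]$, $\tilde{x}_i$ is
$\left(\frac{2^{k/2}}{\sqrt{N}\eta}, 0\right)$-differentially private. By the composition theorem, $\tilde{x}$ is
$(\varepsilon', \delta)$-differentially private for any $\delta > 0$, where

$$\varepsilon'^2 = 2\ln(1/\delta)\left( \frac{1}{N\eta^2} + \sum_{k=1}^{\log N} \frac{N}{2^k} \cdot \frac{2^k}{N\eta^2} \right)
\le 2\ln(1/\delta)\, \frac{1 + \log N}{\eta^2} = \varepsilon^2.$$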

Accuracy. Observe that $\mathbb{E}[\tilde{x}_i] = \hat{x}_i$, since we add unbiased Laplace noise to each
$\hat{x}_i$. Also, the variance of $\mathrm{Lap}(\eta 2^{-k/2})$ is $2\eta^2 2^{-k}$. Therefore,
$\mathbb{E}[\bar{y}_i] = \sqrt{N}\hat{h}_i\hat{x}_i$ and the variance of $\bar{y}_i$ when $i \in [N/2^k, N/2^{k-1} - 1]$ is
$O(N|\hat{h}_i|^2\eta^2 2^{-k})$. By linearity of expectation, $\mathbb{E}[F_N^H \bar{y}] = Hx$. Adding
variances for each $k$ and dividing by $N$, we get the right hand side of (10). The
proof is completed by observing that the inverse Fourier transform $F_N^H$ is an
isometry for the $\ell_2$ norm, so it does not change mean squared error. □

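Theorem 14. For any $h$, Algorithm 2 satisfies $(\varepsilon, \delta)$-differential privacy and
achieves expected mean squared error $O\left(\mathrm{specLB}(h)\, \frac{\log^2 N \ln(1/\delta)}{\varepsilon^2}\right)$.

Proof. By Lemma 4, and since the coefficients $|\hat{h}_i|$ are sorted in
non-increasing order, we know that

$$\mathrm{MSE} \le C\, \frac{(1 + \log N)\ln(1/\delta)}{\varepsilon^2}
\left( |\hat{h}_0|^2 + \sum_{k=1}^{\log N} \frac{N}{2^{2k}}\, |\hat{h}_{N/2^k}|^2 \right)
= O\left( \mathrm{specLB}(h)\, \frac{\log^2 N \ln(1/\delta)}{\varepsilon^2} \right).$$

□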



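B Closed Form Solution for the Optimal $\mathcal{A}(b)$

We derive a closed form solution of (7)-(9) using convex programming duality.
Let us first rewrite the program by substituting $\alpha_i = 1/b_i^2$:

$$\min_{\{\alpha_i\}_{i \in I}} \sum_{i \in I} \frac{|\hat{h}_i|^2}{\alpha_i}
\quad \text{s.t.} \quad \sum_{i \in I} \alpha_i = \frac{N\varepsilon^2}{2\ln(1/\delta)},
\qquad \alpha_i \ge 0,\ \forall i \in I. \quad (11)$$

The Lagrangian is

$$L(\alpha, \nu, \lambda) = \sum_{i \in I} \frac{|\hat{h}_i|^2}{\alpha_i}
+ \nu \left( \sum_{i \in I} \alpha_i - \frac{N\varepsilon^2}{2\ln(1/\delta)} \right)
- \sum_{i \in I} \lambda_i \alpha_i. \quad (12)$$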
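The KKT conditions are given by

$$\forall i \in I: \quad -\frac{|\hat{h}_i|^2}{\alpha_i^2} + \nu - \lambda_i = 0, \qquad
\sum_{i \in I} \alpha_i = \frac{N\varepsilon^2}{2\ln(1/\delta)}, \qquad
\lambda_i \alpha_i = 0, \qquad \alpha_i \ge 0,\ \lambda_i \ge 0.$$

The following solution $(\alpha^*, \nu^*, \lambda^*)$ satisfies the KKT conditions, and is thus the
optimal solution to (11):

$$\forall i \in I: \quad \alpha_i^* = \frac{N\varepsilon^2 |\hat{h}_i|}{2\ln(1/\delta)\|\hat{h}\|_1}, \qquad
\lambda_i^* = 0, \qquad
\nu^* = \left( \frac{2\ln(1/\delta)\|\hat{h}\|_1}{N\varepsilon^2} \right)^2. \quad (14)$$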

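Consequently, the optimal noise parameters $b$ for the original problem (7)-(9),
and the associated MSE are

$$b_i^* = \sqrt{\frac{2\ln(1/\delta)\|\hat{h}\|_1}{N\varepsilon^2 |\hat{h}_i|}} \ \ \forall i \in I, \qquad
\mathrm{MSE}^* = 2\sum_{i \in I} |\hat{h}_i|^2 b_i^{*2} = \frac{4\ln(1/\delta)}{\varepsilon^2 N}\|\hat{h}\|_1^2, \quad (15)$$

which are the noise parameters and MSE of Algorithm 1.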
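
A quick numerical sanity check of the KKT solution can be sketched as follows. The program is taken in the form $\min \sum_{i \in I} |\hat{h}_i|^2/\alpha_i$ subject to $\sum_{i \in I} \alpha_i = N\varepsilon^2/(2\ln(1/\delta))$; the helper names, the toy spectrum, and the simplification that $I$ is the full index set are assumptions of this illustration.

```python
import math

def kkt_solution(h_abs, eps, delta):
    """Candidate optimum (alpha*, nu*) for strictly positive magnitudes |h_hat_i|."""
    n = len(h_abs)   # assumes every coefficient is nonzero, so N = len(h_abs)
    budget = n * eps ** 2 / (2.0 * math.log(1.0 / delta))
    l1 = sum(h_abs)
    alpha = [budget * v / l1 for v in h_abs]
    nu = (l1 / budget) ** 2
    return alpha, nu, budget

def objective(h_abs, alpha):
    """sum_i |h_hat_i|^2 / alpha_i."""
    return sum(v * v / a for v, a in zip(h_abs, alpha))

h_abs = [3.0, 1.0, 0.5, 0.5]
alpha, nu, budget = kkt_solution(h_abs, eps=1.0, delta=1e-6)

# Stationarity with lambda_i = 0: -|h_i|^2 / alpha_i^2 + nu = 0 for every i.
residuals = [-v * v / (a * a) + nu for v, a in zip(h_abs, alpha)]

# Another feasible point: move a little budget from coordinate 0 to coordinate 1.
shifted = list(alpha)
shifted[0] -= 0.1 * alpha[0]
shifted[1] += 0.1 * alpha[0]
```

The stationarity residuals vanish up to rounding, and the shifted feasible point has a larger objective value, as expected for the minimizer of a convex program.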